Most profitable zip codes in New York City for short term rentals

Zillow Data

The dataset has 25 rows (i.e. 25 unique zip codes) and 262 columns, which include the monthly median price from April 1996 through June 2017.

Data Cleaning and Processing: Zillow Data

Dealing with missing rows in zillow_ny_df

The above table shows the percentage of missing values in each column of the dataset. I drop every column whose Missing_Value_Percent is above 0% and retain the remaining columns for further analysis.
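A minimal sketch of this step, assuming the Zillow data is held in a pandas DataFrame. The function names and the toy values below are hypothetical and only illustrate the idea; they are not the original notebook code.

```python
import numpy as np
import pandas as pd


def missing_value_percent(df):
    """Return the percentage of missing values per column, highest first."""
    pct = df.isna().mean() * 100
    return pct.sort_values(ascending=False).rename("Missing_Value_Percent")


def drop_columns_with_missing(df):
    """Keep only the columns that contain no missing values at all."""
    return df.loc[:, df.notna().all()]


# Toy stand-in for zillow_ny_df; the prices are made up for illustration.
toy = pd.DataFrame({
    "RegionName": ["10011", "10013", "10014"],
    "1996-04": [np.nan, 250000.0, np.nan],
    "2017-06": [2480000.0, 3310000.0, 2490000.0],
})

pct = missing_value_percent(toy)
cleaned = drop_columns_with_missing(toy)
print(pct)
print(cleaned.columns.tolist())
```

Dropping whole columns rather than rows keeps all zip codes in the dataset, at the cost of losing the older months that have gaps.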

The above result shows that we have successfully eliminated all the columns with null values

The above two chunks of code compute the yearly median price for every zip code in New York City. The initial Zillow dataset consists of monthly median prices, which would have been cumbersome for further analysis, so I computed the yearly median price for every zip code in NYC for the past 10 years. I chose the median over the mean to represent the typical cost price in each zip code of NYC for the following two reasons:

Instead of having a separate column for each year in the dataset, I am transposing the dataset to include the median house price per year as a row in the dataset for each zipcode.
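The aggregation and reshaping described above can be sketched as follows. This assumes the monthly columns are labelled in a "YYYY-MM" style and that the zip code column is called RegionName; the helper name and the toy prices are hypothetical.

```python
import pandas as pd


def yearly_median_long(df, id_col="RegionName"):
    """Collapse monthly price columns into one row per (zip code, year)."""
    # Wide to long: one row per zip code and month.
    long_df = df.melt(id_vars=id_col, var_name="month", value_name="price")
    # Assumption: month labels start with the four-digit year, e.g. "2016-01".
    long_df["year"] = long_df["month"].str[:4].astype(int)
    yearly = (
        long_df.groupby([id_col, "year"], as_index=False)["price"]
        .median()
        .rename(columns={"price": "median_price"})
    )
    return yearly


# Toy monthly data; the values are made up for illustration.
toy = pd.DataFrame({
    "RegionName": ["10011", "10013"],
    "2016-01": [100.0, 10.0],
    "2016-02": [200.0, 20.0],
    "2016-03": [600.0, 30.0],
    "2017-01": [300.0, 40.0],
    "2017-02": [500.0, 60.0],
})

yearly = yearly_median_long(toy)
print(yearly)
```

In the toy data the 2016 values for 10011 are 100, 200 and 600: the median stays at 200 while the mean would be pulled up to 300 by the single large value.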

Exploratory Data Analysis on Zillow Data

The above trend lines of the median price for each region name show:

From the above line chart, it is difficult to discern the volatility of the median price. Hence, I will next attempt to calculate the standard deviation in the yearly median house price per zip code.
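A minimal sketch of the volatility calculation, assuming the data is already in the long format with one row per zip code and year. The column names and toy values are hypothetical.

```python
import pandas as pd


def price_volatility(yearly, id_col="RegionName", value_col="median_price"):
    """Sample standard deviation of the yearly median price per zip code."""
    return (
        yearly.groupby(id_col)[value_col]
        .std()  # pandas uses the sample standard deviation (ddof=1) by default
        .sort_values(ascending=False)
        .rename("price_std")
    )


# Toy long-format data; the values are made up for illustration.
toy = pd.DataFrame({
    "RegionName": ["A", "A", "A", "B", "B", "B"],
    "year": [2015, 2016, 2017, 2015, 2016, 2017],
    "median_price": [100.0, 110.0, 120.0, 100.0, 200.0, 300.0],
})

volatility = price_volatility(toy)
print(volatility)
```

Because the standard deviation is in price units, expensive zip codes tend to show larger values; dividing by the mean price would give a relative measure, but that is not part of the analysis described here.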

The above bar graph clarifies which regions are highly volatile and which ones are the least volatile. Based on the standard deviation of the median house price over the last 10 years in every zip code of NYC, we can determine the following:

To predict the median price for the years 2018, 2019 and 2020, we calculated the average percentage change in the median price over the past three years, so that the projection reflects how median house prices have changed recently in each zip code. The predicted median price for 2020 is needed to approximate the current median house price in NYC and to evaluate the following:

The above bar graph shows that zip codes such as 10013, 10014 and 10011 are the costliest based on the median price for the year 2020. The 2020 median price is treated as the current cost price in the rest of the analysis.
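The projection of the 2018 to 2020 prices can be sketched as below. Two assumptions are made here that the text does not spell out: the "average percentage change for the past three years" is taken as the mean of the last three year-over-year changes, and that rate is compounded forward from the last observed year. The function name and toy prices are hypothetical.

```python
import pandas as pd


def project_prices(prices, target_years, n_changes=3):
    """Project future prices from the mean of the last year-over-year changes.

    prices: Series of yearly median prices indexed by year, in ascending order.
    """
    # Assumption: average the most recent `n_changes` year-over-year changes.
    avg_growth = prices.pct_change().dropna().tail(n_changes).mean()
    last_year = prices.index[-1]
    last_price = prices.iloc[-1]
    projected = {
        year: last_price * (1 + avg_growth) ** (year - last_year)
        for year in target_years
    }
    return pd.Series(projected, name="projected_price")


# Toy prices for one zip code, growing by exactly 10% per year.
history = pd.Series({2014: 1000.0, 2015: 1100.0, 2016: 1210.0, 2017: 1331.0})
projection = project_prices(history, target_years=[2018, 2019, 2020])
print(projection)
```

With a constant 10% growth in the toy history, the projected 2020 price is 1331 × 1.1³, about 1771.56.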

Airbnb Data

We have selected the columns 'id', 'city', 'state', 'zipcode', 'neighbourhood_group_cleansed', 'latitude', 'longitude', 'is_location_exact', 'price', 'property_type', 'bedrooms', 'availability_365' and 'number_of_reviews' so that the analysis for New York City can identify the popular neighbourhoods, property types, prices, etc.

After retaining only the columns listed above, the dataset has 13 variables (columns) and 40753 rows.

Data Cleaning and Processing: Airbnb Listings Data

The above operations change the datatype of the column 'id' from integer to string. They also strip all empty spaces, $ signs and punctuation from the column 'price' so that it can be converted to the float datatype.
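A minimal sketch of these type conversions, assuming the raw prices are strings such as "$1,250.00". The function name and toy rows are hypothetical; the decimal point is kept so that the cents are not merged into the dollar amount.

```python
import pandas as pd


def clean_listing_types(df):
    """Cast 'id' to string and turn price strings like '$1,250.00' into floats."""
    out = df.copy()
    out["id"] = out["id"].astype(str)
    # Remove everything except digits and the decimal point.
    stripped = out["price"].astype(str).str.replace(r"[^0-9.]", "", regex=True)
    # Anything that cannot be parsed (e.g. an empty string) becomes NaN.
    out["price"] = pd.to_numeric(stripped, errors="coerce").astype(float)
    return out


# Toy listings; the values are made up for illustration.
toy = pd.DataFrame({
    "id": [2515, 2539],
    "price": ["$1,250.00", " $85.00 "],
})

cleaned = clean_listing_types(toy)
print(cleaned.dtypes)
print(cleaned)
```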

The above result confirms that all my variables are assigned with correct datatypes

Dealing with missing rows in NY_listings_df

The above results show the count and percentage of missing values in each column. Since the count and percentage of missing rows are very low compared to the amount of data in the dataset, I decided to drop all rows containing null values from the dataframe.
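The row-level check and removal can be sketched as follows. The helper name and toy rows are hypothetical; only the column names come from the list of retained variables.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Count and percentage of missing values for every column."""
    counts = df.isna().sum()
    return pd.DataFrame({
        "missing_count": counts,
        "missing_percent": counts / len(df) * 100,
    })


# Toy stand-in for NY_listings_df; the values are made up for illustration.
toy = pd.DataFrame({
    "zipcode": ["10011", None, "11215", "10036"],
    "bedrooms": [1.0, 2.0, np.nan, 1.0],
    "price": [150.0, 90.0, 120.0, 200.0],
})

summary = missing_summary(toy)
without_nulls = toy.dropna()  # drops every row with at least one null
print(summary)
print(len(toy), "->", len(without_nulls))
```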

Successfully deleted all the rows with null values

Our final New York listings dataset has 13 columns and 4827 rows

Joining Zillow and New York Airbnb dataframes

Exploratory Data Analysis on New York Airbnb data

Per Night Price of Properties per Neighbourhood

The above bar graph shows that the average nightly price is highest in Manhattan, followed by Brooklyn, Staten Island, and Queens.

In terms of cost price, Manhattan also leads, followed by Brooklyn, Staten Island, and Queens. Although Manhattan is the costliest neighborhood, its revenue from rentals is also the highest. However, to determine which neighborhood is the most profitable, we need to dig deeper and find the annual rate of return per neighborhood. The rental price in a neighborhood is correlated with the cost of property in that neighborhood.

Per Night Price of Properties per Zipcode

The above result shows which zip codes have the most and the least potential of earning daily rental revenue

Average Nightly Price per Zipcode by Neighbourhood

The above bar chart shows that, of the 22 zip codes in the New York City dataset, Manhattan has the greatest number of zip codes and Queens has only one. This suggests that Manhattan has many popular zip codes where rental properties are in high demand, whereas Queens has only a single zip code that is popular for rentals. Staten Island and Brooklyn each have 5 zip codes that are popular for rentals. The figure also shows which zip codes in each neighborhood have the highest average nightly rental prices and therefore the most potential for rental income. For example, zip code 10011 in Manhattan appears to be the most expensive in terms of average nightly rental price. However, to determine profits, we also need to consider the cost price of the properties in each zip code and compare it with the average nightly rental price.

Variability of Price or Nightly Price per Zipcode

The above boxplots show that zip code 10036 has the widest nightly price range (min: 58, max: 4700), followed by zip code 10003 (min: 60, max: 3750). However, both zip codes have outliers, which inflate the range. If we set the outliers aside, the zip codes with the highest variability are, in descending order of variability: 10011, 10013, 10036 and 10128. Zip codes such as 10312, 10304 and 10308 show no price variability. We need to dig further to find out which neighborhoods they fall in.

The above results make it clear that the zip codes in Manhattan have the most price variability, which indicates that the rental price fluctuates the most in that neighborhood. Staten Island has the least price variability. High nightly price variability also indicates a potential for higher risk in Manhattan due to fluctuations in price. However, the fluctuation in price might also reflect seasonality, with rental rates being higher in the summer than in the winter because of changes in demand. Hence, we should keep in mind that the cash flow from rental properties in these two neighborhoods will fluctuate, and this should be factored into future cash flow projections.

Number of Properties by Zipcode

The above bar graph shows that zip code 11215 has the highest number of properties and zip codes 10312 and 10304 have the fewest. We need to find out which neighborhoods these zip codes are located in. It is clear from this data, however, that the zip codes with the most properties are the most popular ones for rentals.

The above bar graph shows that zip code 11215, which has the highest number of properties, is in Brooklyn. The graph makes it clear that the most popular neighborhoods for short-term rentals are Brooklyn and Manhattan. The highest potential for investment in short-term rental properties is in these 2 neighborhoods because they have the highest demand. We will need to check the rate of return in these neighborhoods by zip code to determine the most profitable zip codes to invest in.

The above results indicate that the apartment is the most popular property type by a wide margin. We will next determine which neighborhoods have the most apartments.

It is clear from the above graph that Manhattan has the highest number of apartments, followed by Brooklyn. Hence, Manhattan and Brooklyn are the most popular neighborhoods, with the greatest number of apartments (the most popular property type) available for short-term rentals. It is therefore prudent to invest in apartment-type properties in Manhattan and Brooklyn.

Popularity by Zipcode

The above result shows that the top five zip codes in terms of popularity are: 10306, 10308, 10003, 10022 and 10036

Annual Rate of Return per Zipcode

As per the above results, zip code 10312 has the highest average annual rate of return at almost 14%, followed by zip codes 10304 (9%), 11434 (7%) and 10036 (6.2%).

Gross Rent Multiplier

The GRM calculation compares the property’s cost price or fair market value to its gross rental income. The gross rent multiplier is a good way to take a “quick look” at how fast the property will be paid off from the gross rent it generates. The lower the GRM, the better. Based on this metric, zip code 10312 leads.
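A minimal sketch of the GRM calculation, using the standard definition of cost price divided by gross annual rental income. How the annual income is derived from nightly prices is an assumption here: the occupancy_rate parameter, the function name and the toy figures are all hypothetical, and "A" and "B" are placeholder zip codes.

```python
import pandas as pd


def gross_rent_multiplier(cost_price, nightly_price, occupancy_rate):
    """GRM = cost price / gross annual rental income.

    Assumption: annual income = nightly price x 365 nights x occupancy rate.
    """
    annual_gross_rent = nightly_price * 365 * occupancy_rate
    return cost_price / annual_gross_rent


# Toy figures; the values are made up for illustration.
toy = pd.DataFrame({
    "zipcode": ["A", "B"],
    "cost_price": [500000.0, 1200000.0],
    "nightly_price": [200.0, 300.0],
})
toy["grm"] = gross_rent_multiplier(
    toy["cost_price"], toy["nightly_price"], occupancy_rate=0.75
)
best = toy.sort_values("grm").iloc[0]["zipcode"]
print(toy)
print("Lowest GRM:", best)
```

Under these assumptions the GRM reads as the number of years of gross rent needed to cover the purchase price, which is why lower values are preferred.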

To evaluate in which zip codes to invest, we will measure how each zip code performs against each of the following metrics:

  1. Cost: The cost of property in each zip code. The lower the cost, the better the zip code is in terms of cost.
  2. Popularity: The average number of reviews each zip code received. A high average number of reviews indicates that the zip code is popular for rentals, so the higher it is, the better. It also reflects the favorability of the zip code among renters.
  3. Annual rate of return: This key metric shows how much annual return can be expected from each zip code. The higher it is, the better the zip code performs on this metric.
  4. Variance in rental price: This metric measures how much the rental price varies within each zip code. The lower this metric, the lower the risk of investment and the more stable the cash flow and revenue from the zip code.
  5. Most options: This metric measures the zip codes with the greatest number of rental properties thereby representing the zip code with the most number of options to rent a property for a short duration.
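One way to compare zip codes across these five metrics is sketched below. The text does not state how "performing well" on a metric was decided, so this sketch assumes a simple rule: a zip code meets a metric if it ranks within the top `top_n` for it. The function name, column names, placeholder zip codes and toy values are all hypothetical.

```python
import pandas as pd


def count_metrics_met(df, higher_is_better, lower_is_better, top_n=1):
    """Count, per zip code, how many metrics place it in the top `top_n`."""
    flags = pd.DataFrame(index=df.index)
    for col in higher_is_better:
        flags[col] = df[col].rank(ascending=False, method="min") <= top_n
    for col in lower_is_better:
        flags[col] = df[col].rank(ascending=True, method="min") <= top_n
    return flags.sum(axis=1).rename("metrics_met")


# Toy metrics for three placeholder zip codes; the values are made up.
toy = pd.DataFrame(
    {
        "cost": [100.0, 200.0, 300.0],
        "popularity": [5.0, 9.0, 1.0],
        "annual_ror": [0.10, 0.05, 0.02],
        "price_std": [10.0, 20.0, 30.0],
        "n_listings": [3, 8, 1],
    },
    index=["A", "B", "C"],
)

scores = count_metrics_met(
    toy,
    higher_is_better=["popularity", "annual_ror", "n_listings"],
    lower_is_better=["cost", "price_std"],
)
print(scores)
```

In the toy data, zip code "A" meets three metrics (cost, annual rate of return, price variability) and "B" meets two (popularity, number of listings).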

In conclusion, my methodology resulted in zip codes that satisfy 3 of the metrics at most. Based on this analysis, the real estate company should prioritize investment in the following zip codes in NYC for short-term properties:

  1. Zip codes 10304, 10306, 10308 and 10305 in Staten Island. The zip codes in Staten Island qualified because they performed very well in the “Cost”, “Annual ROR” and “Variability in rental prices” metrics.
  2. Zip code 11434 in Queens also performed well in the “Cost”, “Annual ROR” and “Variability in rental prices” metrics and hence qualified as a good zip code to invest in.
  3. Zip codes 11234, 11217 and 11201 in Brooklyn. Zip codes 11217 and 11201 qualified since they did well in “Popularity”, “Variability in rental prices” and “Most Options”. Zip code 11234, on the other hand, did well in “Cost”, “Annual ROR” and “Variability in rental prices”.
  4. Zip code 10025 in Manhattan. It is the only zip code from Manhattan that qualified, based on performing well on the following 3 metrics: “Popularity”, “Annual ROR” and “Most Options”.